Exploiting Latent Semantic Relations in Highly Linked Hypertext for Information Retrieval in Wikis

Authors

  • Tristan Miller
  • Bertin Klein
  • Elisabeth Wolf
Abstract

Good hypertext writing style mandates that link texts clearly indicate the nature of the link target. While this guideline is routinely ignored in HTML, the lightweight markup languages used by wikis encourage or even force hypertext authors to use semantically appropriate link texts. This property of wiki hypertext makes it an ideal candidate for processing with latent semantic analysis (LSA), a factor analysis technique for finding latent transitive relations among natural-language documents. In this study, we design, implement, and test an LSA-based information retrieval system for wikis. Instead of a full-text index, our system indexes only link texts and document titles. Nevertheless, its precision exceeds that of a popular full-text search engine, and is comparable to that of PageRank-based systems such as Google.
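The idea of indexing only titles and link texts and then retrieving with LSA can be illustrated with a minimal sketch. This is not the authors' implementation: the toy pages, the function names `build_matrix` and `search`, the raw term counts (no weighting scheme), and the choice of k=2 latent dimensions are all assumptions made for illustration, and NumPy is assumed to be available. The sketch builds a term-by-page matrix from link texts, truncates its singular value decomposition, folds a query into the latent space, and ranks pages by cosine similarity.

```python
import numpy as np

# Hypothetical toy wiki: each page is described only by its title and the
# link texts other pages use to point at it, never by its body text.
PAGES = {
    "PythonLanguage": ["python language", "python", "scripting language"],
    "JavaLanguage": ["java language", "java", "programming language"],
    "CoffeeBrewing": ["coffee brewing", "java coffee", "espresso"],
}


def build_matrix(pages):
    """Return (term-by-page count matrix, term -> row index)."""
    vocab = sorted({w for texts in pages.values() for t in texts for w in t.split()})
    index = {w: i for i, w in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(pages)))
    for col, texts in enumerate(pages.values()):
        for text in texts:
            for word in text.split():
                matrix[index[word], col] += 1.0
    return matrix, index


def search(query, pages, k=2):
    """Rank pages by cosine similarity to the query in a rank-k latent space."""
    matrix, index = build_matrix(pages)
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    u_k, s_k, docs_k = u[:, :k], s[:k], vt[:k, :].T  # one row per page

    # Fold the query into the latent space: q_hat = q^T U_k S_k^{-1}.
    q = np.zeros(len(index))
    for word in query.lower().split():
        if word in index:  # words outside the vocabulary are ignored
            q[index[word]] += 1.0
    q_hat = (q @ u_k) / s_k

    # Cosine similarity between the folded query and each page vector.
    norms = np.linalg.norm(docs_k, axis=1) * np.linalg.norm(q_hat)
    scores = docs_k @ q_hat / np.where(norms == 0, 1.0, norms)
    order = np.argsort(-scores)
    names = list(pages)
    return [(names[i], float(scores[i])) for i in order]


if __name__ == "__main__":
    for name, score in search("espresso", PAGES):
        print(f"{name}: {score:.3f}")
```

Because the index here holds only a handful of short strings per page, the matrix stays small compared with a full-text index; a real system would likely add term weighting and choose k empirically.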

Similar resources

A Semantic Wiki Based on Spatial Hypertext

Spatial Hypertext Wiki (ShyWiki) is a wiki which represents knowledge using notes that are spatially distributed in wiki pages and have visual characteristics such as colour, size, or font type. The use of spatial and visual characteristics in wikis is important to improve human comprehension, creation and organization of knowledge. Another important capability in wikis is to allow machines to ...

A Semantic Spatial Hypertext Wiki

Spatial Hypertext Wiki (ShyWiki) is a wiki which represents knowledge using notes that are spatially distributed in wiki pages and have some visual characteristics such as colour, size, or font type. Spatial and visual characteristics are important in a wiki to improve human comprehension, creation and organization of knowledge. Another important capability in wikis is to allow machines to proc...

Creating and Exploiting a Hybrid Knowledge Base for Linked Data

Twenty years ago Tim Berners-Lee proposed a distributed hypertext system based on standard Internet protocols. The Web that resulted fundamentally changed the ways we share information and services, both on the public Internet and within organizations. That original proposal contained the seeds of another effort that has not yet fully blossomed: a Semantic Web designed to enable computer progra...

Aemoo: Exploratory Search based on Knowledge Patterns over the Semantic Web

Aemoo is a Web application supporting exploratory search over the Semantic Web. Through a simple keyword-based search interface, users can query Aemoo for information about any entity, which is then collected by aggregating knowledge from diverse sources such as linked data, Wikipedia, Twitter, and Google News. Such aggregation is performed according to cognitively-sound principles through the ...

Fifth Workshop on Exploiting Semantic Annotations in Information Retrieval (ESAIR’12) CIKM 2012 Workshop

There is an increasing amount of structure on the Web as a result of modern Web languages, user tagging and annotation, emerging robust NLP tools, and an ever growing volume of linked data. These meaningful, semantic, annotations hold the promise to significantly enhance information access, by enhancing the depth of analysis of today’s systems. Currently, we have only started exploring the poss...

Publication date: 2009